30 research outputs found

    Modelling Heterogeneous DSP–FPGA Based System Partitioning with Extensions to the Spinach Simulation Environment

    In this paper we present system-on-a-chip extensions to the Spinach simulation environment for rapidly prototyping heterogeneous DSP/FPGA based architectures, specifically in the embedded domain. This infrastructure has been successfully used to model systems varying from multiprocessor Gigabit Ethernet controllers to Texas Instruments C6x series DSP based systems with tightly coupled FPGA based coprocessors for computational offloading. As an illustrative example of this toolset's functionality, we investigate workload partitioning in heterogeneous DSP/FPGA based embedded environments. Specifically, we focus on computational offloading of matrix multiplication kernels across DSP/FPGA based embedded architectures.

    Hardware/Software Co-design Methodology and DSP/FPGA Partitioning: A Case Study for Meeting Real-Time Processing Deadlines in 3.5G Mobile Receivers

    This paper presents a DSP/FPGA hardware/software partitioning methodology for signal processing workloads. The example workload is channel equalization and user detection in the HSDPA wireless standard for 3.5G mobile handsets. Channel equalization and user detection is a major component of receiver baseband processing and requires strict adherence to real-time deadlines. By intelligently exploring the embedded design space, this paper presents a hardware/software system-on-chip partitioning that utilizes both DSP and FPGA based coprocessors to meet and exceed the real-time data rates determined by the HSDPA standard. Hardware and software partitioning strategies are discussed with respect to real-time processing deadlines, while an SOC simulation toolset is presented as a vehicle for prototyping embedded architectures.
    Nokia Inc.; Texas Instruments; National Science Foundation

    A General Hardware/Software Co-design Methodology for Embedded Signal Processing and Multimedia Workloads

    This paper presents a hardware/software co-design methodology for partitioning real-time embedded multimedia applications between software programmable DSPs and hardware based FPGA coprocessors. By following a strict set of guidelines, the input application is partitioned between software executing on a programmable DSP and a hardware based FPGA implementation to alleviate computational bottlenecks in modern VLIW style DSP architectures used in embedded systems. This methodology is applied to channel estimation firmware in 3.5G wireless receivers, as well as software based H.263 video decoders. As much as an 11x improvement in runtime performance can be achieved by partitioning performance critical software kernels in these workloads into a hardware based FPGA implementation executing in tandem with the existing host DSP.
    Nokia Inc.; Texas Instruments; National Science Foundation
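As a rough illustration of why the choice of kernels matters for a figure like the 11x quoted above, the sketch below applies Amdahl's law. This is not the paper's partitioning method: it assumes the offloaded kernels and the remaining DSP code run one after the other and it ignores DSP/FPGA communication overhead. The function names and the example fractions are hypothetical.

```python
def overall_speedup(offloaded_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: whole-application speedup when a fraction of the
    original runtime is accelerated by kernel_speedup (assumed model)."""
    if not 0.0 <= offloaded_fraction <= 1.0:
        raise ValueError("offloaded_fraction must be in [0, 1]")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    remaining = 1.0 - offloaded_fraction
    return 1.0 / (remaining + offloaded_fraction / kernel_speedup)


def min_fraction_for(target_speedup: float) -> float:
    """Smallest runtime fraction that must be offloaded to reach
    target_speedup, even if the offloaded part became infinitely fast."""
    if target_speedup < 1.0:
        raise ValueError("target_speedup must be at least 1")
    return 1.0 - 1.0 / target_speedup


if __name__ == "__main__":
    # Hypothetical numbers, not taken from the paper.
    print(round(overall_speedup(0.95, 50.0), 2))
    print(round(min_fraction_for(11.0), 3))
```

Under this simplified model, an 11x overall gain is only reachable if the offloaded kernels account for more than about 91% of the original runtime, which is consistent with targeting performance critical kernels.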

    Design and Analysis of Heterogeneous DSP/FPGA Based Architectures for 3GPP Wireless Systems

    This paper shows how iterative hardware/software partitioning in heterogeneous DSP/FPGA based embedded systems can be utilized to achieve real-time deadlines of modern 3GPP wireless equalization workloads. By utilizing a well-defined set of application partitioning criteria in tandem with SOC simulation tools, we are able to show a greater than sixfold improvement in application performance and ultimately meet, and even exceed, real-time data processing deadlines.

    Reconfigurable Architectures for Wireless Systems: Design Exploration and Integration Challenges

    Mobile devices are severely power and area limited due to battery capacity and system size. In many of these example systems, advanced features require computationally complex signal processing on high-speed data streams for enhanced networking capabilities. Thus, mapping high-level communication and networking algorithms to system architectures is a complex and challenging procedure. An important challenge is to characterize the area, time, and power requirements of these embedded system modules and to use this information effectively to determine the architecture of programmable, reconfigurable, and fixed-function modules. In this paper, we will focus on application examples in wireless networking which highlight these challenges in reconfigurable systems integration.
    Nokia Corporation; Texas Instruments Incorporated; National Science Foundation

    Reliability of Wearable-Sensor-Derived Measures of Physical Activity in Wheelchair-Dependent Spinal Cord Injured Patients

    Physical activity (PA) has been shown to have a positive influence on functional recovery in patients after a spinal cord injury (SCI). Hence, it can act as a confounder in clinical intervention studies. Wearable sensors are used to quantify PA in various neurological conditions. However, there is a lack of knowledge about the inter-day reliability of PA measures. The objective of this study was to investigate the single-day reliability of various PA measures in patients with a SCI and to propose recommendations on how many days of PA measurements are required to obtain reliable results. For this, the PA of 63 wheelchair-dependent patients with a SCI was measured using wearable sensors. Patients of all age ranges (49.3 ± 16.6 years) and levels of injury (from C1 to L2, ASIA A-D) were included in this study and assessed at three to four different time periods during inpatient rehabilitation (2 weeks, 1 month, 3 months, and if applicable 6 months after injury) and after inpatient rehabilitation in their home environment (at least 6 months after injury). The metrics of interest were total activity counts, PA intensity levels, metrics of wheeling quantity and metrics of movement quality. Activity counts showed consistently high single-day reliabilities, while measures of PA intensity levels varied considerably depending on the rehabilitation progress. Single-day reliabilities of metrics of movement quantity decreased with rehabilitation progress, while those of metrics of movement quality increased. To achieve a mean reliability of 0.8, we found that three continuous recording days are required for out-patients, and two days for in-patients. Furthermore, the results show similar weekday and weekend wheeling activity for in- and out-patients. To our knowledge, this is the first study to investigate the reliability of an extended set of sensor-based measures of PA in both acute and chronic wheelchair-dependent SCI patients. The results provide recommendations for sensor-based assessments of PA in clinical SCI studies.
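One common way to translate a single-day reliability into a required number of recording days is the Spearman-Brown prophecy formula. The abstract does not say which procedure the authors used, so the sketch below is only an illustration of that general technique; the function names and the example reliabilities (0.6 and 0.7) are hypothetical and not taken from the study.

```python
import math


def spearman_brown(single_day_reliability: float, n_days: float) -> float:
    """Predicted reliability of the mean over n_days of measurement,
    given the reliability of a single day (Spearman-Brown formula)."""
    r = single_day_reliability
    return n_days * r / (1.0 + (n_days - 1.0) * r)


def days_needed(single_day_reliability: float, target: float = 0.8) -> int:
    """Smallest whole number of days whose averaged measure reaches target."""
    r = single_day_reliability
    if not (0.0 < r < 1.0 and 0.0 < target < 1.0):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    # Spearman-Brown solved for the number of days.
    n = target * (1.0 - r) / (r * (1.0 - target))
    # Small tolerance so an exact integer result is not rounded up.
    return max(1, math.ceil(n - 1e-9))


if __name__ == "__main__":
    # Hypothetical single-day reliabilities, for illustration only.
    for r in (0.6, 0.7):
        print(r, days_needed(r, target=0.8))
```

With these illustrative inputs, a single-day reliability of 0.6 calls for three days and 0.7 for two days to reach 0.8.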

    Estimation of Energy Expenditure in Wheelchair-Bound Spinal Cord Injured Individuals Using Inertial Measurement Units

    A healthy lifestyle reduces the risk of cardiovascular disease. As wheelchair-bound individuals with spinal cord injury (SCI) are challenged in their activities, promoting and coaching an active lifestyle is especially relevant. Although there are many commercial activity trackers available for the able-bodied population, including those providing feedback about energy expenditure (EE), activity trackers for the SCI population are largely lacking, or are limited to a small set of activities performed in controlled settings. The aims of the present study were to develop and validate an algorithm based on inertial measurement unit (IMU) data to continuously monitor EE in wheelchair-bound individuals with a SCI, and to establish reference activity values for a healthy lifestyle in this population. For this purpose, EE was measured in 30 subjects each wearing four IMUs during 12 different physical activities, randomly selected from a list of 24 activities of daily living. The proposed algorithm consists of three parts: resting EE estimation based on multi-linear regression, an activity classification using a k-nearest-neighbors algorithm, and EE estimation based on artificial neural networks (ANNs). The mean absolute estimation error for the ANN-based algorithm was 14.4% compared to indirect calorimeter measurements. Based on reference values from the literature and the data collected within this study, we recommend wheeling 3 km per day for a healthy lifestyle in wheelchair-bound SCI individuals. Combining the proposed algorithm with a recommendation for physical activity provides a powerful tool for the promotion of an active lifestyle in the SCI population, thereby reducing the risk of secondary diseases.

    Persistence of engineered nanoparticles in a municipal solid-waste incineration plant

    More than 100 million tonnes of municipal solid waste are incinerated worldwide every year [1]. However, little is known about the fate of nanomaterials during incineration, even though the presence of engineered nanoparticles in waste is expected to grow [2]. Here, we show that cerium oxide nanoparticles introduced into a full-scale waste incineration plant bind loosely to solid residues from the combustion process and can be efficiently removed from flue gas using current filter technology. The nanoparticles were introduced either directly onto the waste before incineration or into the gas stream exiting the furnace of an incinerator that processes 200,000 tonnes of waste per year. Nanoparticles that attached to the surface of the solid residues did not become a fixed part of the residues and did not demonstrate any physical or chemical changes. Our observations show that although it is possible to incinerate waste without releasing nanoparticles into the atmosphere, the residues to which they bind eventually end up in landfills or recovered raw materials, confirming that there is a clear environmental need to develop degradable nanoparticles.

    Dynamically reconfigurable data caches in low-power computing

    In order to curb microprocessor power consumption, we propose an L1 data cache which can be reconfigured dynamically at runtime according to the cache requirements of a given application. A two-phase approach is used, involving both compile-time information and the runtime monitoring of program performance. The compiler predicts the L1 data cache requirements of loop nests in the input program, and instructs the hardware on how much L1 data cache to enable during a loop nest's execution. For regions of the program not analyzable at compile time, the hardware itself monitors program performance and reconfigures the L1 data cache so as to maintain cache performance while minimizing cache power consumption. In addition to this, we provide a study of data reuses inside loop nests of the SPEC CPU2000 and Mediabench benchmarks. The sensitivity of data reuses to L1 data cache associativity is analyzed to illustrate the potential power savings a reconfigurable L1 data cache can achieve.

    Reconfigurable heterogeneous DSP/FPGA based embedded architectures for numerically intensive computing workloads

    Telecommunications and multimedia form a vast segment of the embedded systems market. Variations in standards coupled with the desire for software programmability often result in software based implementations executing on DSP cores. With the advent of data intensive media and communications workloads, computational demands of the DSP are ever increasing. Despite increases in clock rates, the computational demands of many wireless and multimedia video kernels far exceed the available pipeline arithmetic and logic unit (ALU) resources of today's DSP devices. This thesis presents a hardware/software co-design methodology for partitioning real-time embedded multimedia applications between software programmable DSPs and hardware based FPGA coprocessors. Using a strict set of guidelines, input applications are partitioned between software executing on a programmable DSP and a hardware based FPGA implementation. This methodology is applied to channel estimation firmware in 3.5G wireless receivers, as well as software based H.263 video decoders. These heterogeneous systems are prototyped using a custom simulation environment created for these studies, which models bit-true, cycle-accurate heterogeneous embedded architectures. By partitioning performance critical kernels from software on the DSP to FPGA based loosely coupled coprocessors, significant performance gains over what is possible with modern DSP architectures are shown. This thesis also investigates the instruction and data level parallelism in modern digital signal processing and multimedia workloads, and presents a retargetable compiler infrastructure for multi-clustered VLIW style digital signal processor architectures. By recompiling existing workloads, the thesis compares the performance of aggressive hardware/software partitioning between modern DSP cores and loosely coupled FPGA based coprocessors with the performance of massively multi-clustered VLIW style architectures. The compiler infrastructure allows existing DSP kernels to be retargeted for user defined machine definitions. In doing this, the thesis shows that increased hardware parallelism within the DSP core can yield significant performance gains, and quantifies the amount of hardware necessary to compete with FPGA based performance. In conclusion, the thesis advocates application specific DSP design with increased hardware parallelism for modern signal processing and multimedia workloads, as well as loosely coupled hardware based coprocessors for truly high performance computing in these domains.